A Probabilistic Model for Learning Concatenative Morphology
نویسندگان
چکیده
This paper describes a system for the unsupervised learning of morphological suffixes and stems from word lists. The system is composed of a generative probability model and hill-climbing and directed search algorithms. By extracting and examining morphologically rich subsets of an input lexicon, the directed search identifies highly productive paradigms. The hill-climbing algorithm then further maximizes the probability of the hypothesis. Quantitative results are shown by measuring the accuracy of the morphological relations identified. Experiments in English and Polish, as well as comparisons with another recent unsupervised morphology learning algorithm demonstrate the effectiveness of this technique.
منابع مشابه
Adaptor Grammars for Learning Non-Concatenative Morphology
This paper contributes an approach for expressing non-concatenative morphological phenomena, such as stem derivation in Semitic languages, in terms of a mildly context-sensitive grammar formalism. This offers a convenient level of modelling abstraction while remaining computationally tractable. The nonparametric Bayesian framework of adaptor grammars is extended to this richer grammar formalism...
متن کاملLearning non-concatenative morphology
Recent work in computational psycholinguistics shows that morpheme lexica can be acquired in an unsupervised manner from a corpus of words by selecting the lexicon that best balances productivity and reuse (e.g. Goldwater et al. (2009) and others). In this paper, we extend such work to the problem of acquiring non-concatenative morphology, proposing a simple model of morphology that can handle ...
متن کاملSemi-supervised induction of a concatenative morphology with simple morphotactics A model in the Morfessor family
متن کامل
Evaluating Sequence Alignment for Learning Inflectional Morphology
This work examines CRF-based sequence alignment models for learning natural language morphology. Although these systems have performed well for a limited number of languages, this work, as part of the SIGMORPHON 2016 shared task, specifically sets out to determine whether these models handle non-concatenative morphology as well as previous work might suggest. Results, however, indicate a strong...
متن کامل